Gold Standard Online Debates Summaries and First Experiments Towards Automatic Summarization of Online Debate Data
Abstract
Usage of online textual media is steadily increasing. Every day, more news stories, blog posts and scientific articles are added to the online volumes. These are all freely accessible and have been employed extensively in multiple research areas, e.g. automatic text summarization, information retrieval and information extraction. Meanwhile, online debate forums have recently become popular but remain largely unexplored. As a result, there are insufficient resources of annotated debate data available for conducting research in this genre. In this paper, we collect and annotate debate data for an automatic summarization task. As in extractive gold standard summary generation, our data contains sentences judged worthy of inclusion in a summary. Five human annotators performed this task. Inter-annotator agreement, based on semantic similarity, is 36% for Cohen’s kappa and 48% for Krippendorff’s alpha. We also implement an extractive summarization system for online debates and discuss prominent features for the task of summarizing online debate data automatically.
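For readers unfamiliar with the agreement statistic mentioned in the abstract, the following is a minimal sketch of Cohen's kappa for two annotators who each mark every sentence as selected (1) or not selected (0). It is an illustration only: the function name `cohen_kappa` and the example labels are hypothetical, and it counts agreement by exact label match, whereas the paper reports agreement based on semantic similarity, which this sketch does not attempt to reproduce.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: agreement means the two labels are identical. The paper's
    semantic-similarity-based matching is not modelled here.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # annotators' marginal proportions, summed over categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical selections for ten sentences (1 = include in summary).
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

With five annotators, as in the paper, one common option is to average kappa over all annotator pairs; whether the authors did so is not stated in the abstract.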
Similar resources
Automatic Summarization of Online Debates
Debate summarization is one of the novel and challenging research areas in automatic text summarization which has been largely unexplored. In this paper, we develop a debate summarization pipeline to summarize key topics which are discussed or argued in the two opposing sides of online debates. We view that the generation of debate summaries can be achieved by clustering, cluster labeling, and ...
ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization
JAYKUMAR, NISHITA. M.S., Department of Computer Science and Engineering, Wright State University, 2016. ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization. Automatic generation of summaries that capture the salient aspects of a search resultset (i.e., automatic summarization) has become an important task in biomedical research. Automatic summarization offers...
Understanding Human Preferences for Summary Designs in Online Debates Domain
Research on automatic text summarization has primarily focused on summarizing news, web pages, scientific papers, etc. While in some of these text genres, it is intuitively clear what constitutes a good summary, the issue is much less clear cut in social media scenarios like online debates, product reviews, etc., where summaries can be presented in many ways. As yet, there is no analysis about ...
An Investigation of the Online Farsi Translation of Metadiscourse Markers in American Presidential Debates
The term metadiscourse rarely appears in translation studies despite the continuously growing body of research on discourse markers in different genres and through various perspectives. Translation as a product that needs to observe such markers for their communicative power and contribution to the overall coherence of a text within a context has not been satisfactorily studied. Motivated by su...
Analysis of Human Summaries for Automatic Summarization
The current “Information Explosion” necessitates methods to reduce the vast amounts of text found from online sites and other sources through automated summarization. As computationally complex as automated text summarization may be, improved Natural Language Processing methods and closer semantic analysis are progressively used for overall summarization improvement. The purpose of this study i...
Journal title:
- CoRR
Volume: abs/1708.04592
Pages: -
Publication date: 2017